Data2Vec is a general self-supervised learning framework for speech, vision, and language processing. This checkpoint is an automatic speech recognition (ASR) model, pre-trained and then fine-tuned on 960 hours of LibriSpeech audio.
Speech Recognition
Transformers, English
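A minimal sketch of how a Data2Vec speech recognition checkpoint might be run with the Transformers library. The description above does not name a specific checkpoint, so the Hub ID `facebook/data2vec-audio-base-960h` is an assumption; substitute the one you actually use. The helper names `silence` and `transcribe` are hypothetical, introduced only for illustration. The sketch assumes 16 kHz mono input, which matches LibriSpeech, and uses simple greedy decoding over the model's output logits.

```python
# Assumed Hub checkpoint ID; the description does not say which one is meant.
MODEL_ID = "facebook/data2vec-audio-base-960h"
# LibriSpeech audio is sampled at 16 kHz, so the model expects this rate.
SAMPLE_RATE = 16_000


def silence(duration_s):
    """Hypothetical helper: a placeholder mono waveform of zeros at 16 kHz."""
    return [0.0] * int(duration_s * SAMPLE_RATE)


def transcribe(waveform, model_id=MODEL_ID):
    """Hypothetical helper: transcribe one mono 16 kHz waveform (floats).

    Heavy imports are kept inside the function so that the module can be
    imported without torch or transformers installed.
    """
    import numpy as np
    import torch
    from transformers import Data2VecAudioForCTC, Wav2Vec2Processor

    # Downloads the weights on first use (network access required).
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Data2VecAudioForCTC.from_pretrained(model_id)
    model.eval()

    audio = np.asarray(waveform, dtype=np.float32)
    inputs = processor(audio, sampling_rate=SAMPLE_RATE, return_tensors="pt")

    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: take the most likely token at every frame, then let
    # the processor collapse repeats and blanks into text.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

To try it, load a real recording as a list or array of floats at 16 kHz and call `transcribe(waveform)`. The `silence` helper only produces a stand-in signal for checking that the pipeline runs; it will not yield a meaningful transcript.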